Merge matrix runs to fail fast globally #1216

Merged


@sjain-stanford sjain-stanford commented Aug 12, 2022

My earlier PR had (among other things) decoupled the ubuntu and macos builds into separate matrix runs. This is not working well: the limited number of macOS GHA VMs is causing long queue times and a backlog. Two things contribute to this backlog:

  1. macos-arm64 builds that build pytorch from source are getting erratically cancelled due to resource / network constraints. This is addressed in: Use pytorch binary for macos-arm64 workflow #1215

"macos-arm64 (in-tree, OFF) The hosted runner: GitHub Actions 3 lost communication with the server. Anything in your workflow that terminates the runner process, starves it for CPU/Memory, or blocks its network access can cause this error."

  2. macos runs don't fail-fast when ubuntu runs fail, because the two are in separate matrix setups. This PR couples them again.
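
For illustration, coupling the two platforms could look roughly like the sketch below. This is a minimal, hypothetical workflow and not the actual torch-mlir configuration: the job name, the `target` matrix key, the runner labels, and the placeholder build step are all assumptions. The relevant point is that `fail-fast` is scoped to a single matrix, so ubuntu and macos entries need to live in the same matrix for a failure on one to cancel the other.

```yaml
# Hypothetical sketch; names and steps are illustrative, not copied from the repo.
name: Build and Test
on: [pull_request]

jobs:
  build-test:
    strategy:
      # fail-fast is per matrix: when one job in this matrix fails,
      # the in-progress and queued jobs of the same matrix are cancelled.
      fail-fast: true
      matrix:
        target: [ubuntu-x86_64, macos-arm64]
        include:
          # Map each (assumed) target name to a hosted runner label.
          - target: ubuntu-x86_64
            os: ubuntu-latest
          - target: macos-arm64
            os: macos-latest
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v3
      - name: Build and test (placeholder)
        run: echo "building for ${{ matrix.target }}"
```

With two separate jobs, each with its own matrix, a failed ubuntu job would leave the macos jobs queued or running, which is the backlog behaviour described in item 2.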

@sjain-stanford
Member Author

@powderluv PTAL. Marking this as Draft until we analyze the CI logs to ensure this is doing what we expect it to.

Collaborator

@powderluv powderluv left a comment


Let's merge and iterate.

@sjain-stanford sjain-stanford marked this pull request as ready for review August 12, 2022 18:29
@sjain-stanford sjain-stanford merged commit aed0ec3 into llvm:main Aug 12, 2022
qedawkins pushed a commit to nod-ai/torch-mlir that referenced this pull request Oct 3, 2022
Signed-off-by: Ettore Tiotto <etiotto@ca.ibm.com>
@sjain-stanford sjain-stanford deleted the sambhav/combine_matrix_workflows branch November 10, 2022 19:53
2 participants